Abstract

According to recent findings [1, 2], empirical covariance matrices deduced from
financial return series contain such a high amount of noise that, apart from a 
few large eigenvalues and the corresponding eigenvectors, their structure can 
essentially be regarded as random. In [1], e.g., it is reported that about 94%
of the spectrum of these matrices can be fitted by that of a random matrix 
drawn from an appropriately chosen ensemble. In view of the fundamental 
role of covariance matrices in the theory of portfolio optimization as well as in 
industry-wide risk management practices, we analyze the possible implications 
of this effect. Simulation experiments with matrices having a structure such as 
described in [1, 2] lead us to the conclusion that in the context of the classical
portfolio problem (minimizing the portfolio variance under linear constraints) 
noise has relatively little effect. To leading order the solutions are determined 
by the stable, large eigenvalues, and the displacement of the solution (measured
in variance) due to noise is rather small: depending on the size of the
portfolio and on the length of the time series, it is of the order of 5 to 15%. 
The picture is completely different, however, if we attempt to minimize the 
variance under non-linear constraints, like those that arise e.g. in the problem 
of margin accounts or in international capital adequacy regulation. In these 
problems the presence of noise leads to a serious instability and a high degree 
of degeneracy of the solutions. 
1 Introduction



The concept of financial risk, which attempts to quantify the uncertainty of the outcome 
of an investment and hence the magnitude of possible losses, plays a fundamental role 
in finance today. Portfolio optimization aims at giving a recipe for the composition 
of portfolios such that the overall risk is minimized for a given reward, or, conversely, 
reward is maximized for a given risk. For example, the classical portfolio optimization 
problem formulated first by Markowitz [3] relies on the variance as a risk measure and
expected return as a measure for reward. Since the return on a portfolio is a linear 
combination of the returns on the assets forming the portfolio with weights given by the 
proportion of wealth invested in the assets, the portfolio variance can be expressed as 
a quadratic form of these weights with the volatilities and correlations as coefficients. 
For any practical use of the theory, it will, therefore, be necessary to have reliable 
estimates for the volatilities and correlations, which, in most cases, are obtained from 
historical return series. Actually, volatility and correlation estimates extracted from 
historical data have become standard tools also for several other risk management 
practices widely utilized in the financial industry. 

Recently it has, however, been found by two independent groups [1, 2] that empirical
covariance matrices deduced from financial return series contain such a high amount of 
noise that, apart from a few large eigenvalues and the corresponding eigenvectors, their 
structure can essentially be regarded as random. In [1], e.g., it is reported that about
94% of the spectrum of correlation matrices determined from return series on the S&P 
500 stocks can be fitted by that of a random matrix drawn from an appropriately chosen 
ensemble. In view of these striking results, the Markowitz portfolio optimization scheme 
based on a purely historical determination of the covariance matrix would seem to be 
totally inadequate [0, ^, but the credibility of a number of standard risk management 
methodologies would also be shaken. 

In this paper we will argue, however, that the impact of the results of [1, 2] on the
portfolio optimization problem may not be as dramatic as one might have expected. 
More specifically, it will be shown in a simulation example that for parameter values 
typically encountered in practice, the risk of the optimal portfolio determined in the 
presence of noise is usually no more than 5-15% higher than the risk without noise. 
Despite the high degree of noise of the covariance matrix, which translates indeed 
into a significant displacement in the weights of the optimal portfolio, the effect on 
the actual risk at the optimum is only of second order and therefore less pronounced. 
This suggests that some of the risk management methodologies based on empirical 
covariance matrices can actually be sufficiently accurate. The main purpose of this 
paper is to point out that the results of [1, 2] and the practical usefulness of covariance
matrices can, in fact, be reconciled. 
2 Results and Discussion



We consider the following simplified version of the classical portfolio optimization
problem: the portfolio variance $\sum_{i,j=1}^{n} w_i \sigma_{ij} w_j$ is to be minimized under the budget
constraint $\sum_{i=1}^{n} w_i = 1$, where $w_i$ denotes the weight of asset $i$ in the portfolio while $\sigma_{ij}$
represents the covariance matrix of returns (considered here as given). This means
we exclude the riskless bond and seek the minimal risk portfolio in the space of risky 
assets. One might, of course, impose additional constraints (e.g. the usual one on the 
return), but the simplified form at hand provides the most convenient laboratory to 
test the effect of noise. The solution to the optimization problem can then be found 
using the method of Lagrange multipliers, and after some trivial algebra one obtains 
for the weights of the optimal portfolio: 

$$w_i^* = \frac{\sum_{j=1}^{n} \sigma^{-1}_{ij}}{\sum_{j,k=1}^{n} \sigma^{-1}_{jk}}. \qquad (1)$$
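As a minimal sketch of how Eq. (1) can be evaluated numerically, the following Python snippet computes the minimum-variance weights for a given covariance matrix. The function name `min_variance_weights` and the three-asset matrix `example_cov` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import numpy as np

def min_variance_weights(cov):
    """Weights of the minimum-variance portfolio under the budget constraint sum(w) = 1."""
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    # Solve cov @ x = 1 instead of forming the inverse explicitly;
    # x_i equals sum_j (cov^-1)_ij, the numerator of Eq. (1).
    x = np.linalg.solve(cov, ones)
    # Dividing by sum(x) = sum_jk (cov^-1)_jk enforces the budget constraint.
    return x / x.sum()

# Small hypothetical 3-asset covariance matrix, for illustration only.
example_cov = np.array([[0.04, 0.01, 0.00],
                        [0.01, 0.09, 0.02],
                        [0.00, 0.02, 0.16]])
example_w = min_variance_weights(example_cov)
print(example_w, example_w.sum())
```

Solving the linear system rather than inverting the matrix is a numerical convenience; the result is the same as summing the rows of the inverse and normalizing.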

According to [1, 2], correlation matrices determined from financial return series are
such that apart from a few large eigenvalues and the corresponding eigenvectors, their 
structure is essentially random. Random matrix theory (RMT) allows one to calculate 
different eigenvalue and eigenvector statistics, e.g. of a matrix $C_{ij} = \frac{1}{T} \sum_{t=1}^{T} x_{it} x_{jt}$
determined from series of random variables $x_{it}$, independent and identically distributed,
of mean zero and of unit variance ($i = 1, 2, \ldots, n$ and $t = 1, 2, \ldots, T$); see [1, 2]
and references therein. The observed deviations of empirical correlation matrices from 
RMT predictions ^, ^, ^ are due to genuine correlations between the financial series, 
while the apparently dominating random part can be interpreted as noise superimposed 
on these correlations. Therefore, any procedure using as input correlation matrices 
determined from financial return series will be biased by a significant amount of noise, 
and for the practical applicability of the procedure it would be highly desirable to know 
the magnitude of this bias. 

In order to get an idea about the magnitude of the effect, we compare the solution 
obtained for a given noiseless covariance matrix $\sigma^{(0)}_{ij}$ with that obtained when noise is
added (we call the new covariance matrix $\sigma_{ij}$). If, for example, $\sigma^{(0)}_{ij}$ is simply chosen
to be an $n \times n$ identity matrix, noisy covariance matrices $\sigma_{ij}$ can be generated as



$$\sigma_{ij} = \frac{1}{T} \sum_{t=1}^{T} x_{it} x_{jt}, \qquad (2)$$



where $x_{it} \sim$ i.i.d. $N(0,1)$. (Of course, in the limit $T \to \infty$ the noise disappears and
$\sigma_{ij} \to \sigma^{(0)}_{ij}$.) The solutions to the optimization problem in this setup are $w_i^{(0)*} = 1/n$
and $w_i^*$ given by Eq. (1), respectively. The difference between the two solutions shows
the displacement of the optimal portfolio due to noise and provides a measure for the 
effect of noise on the optimization problem. 
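The experiment described above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the authors' code: the helper names, the random seed, and the reduced size (n = 100, T = 600, keeping the ratio n/T = 1/6 of the N = 500, T = 3000 case) are choices made here to keep the run fast.

```python
import numpy as np

def noisy_covariance(n, T, rng):
    """Sample covariance of Eq. (2): (1/T) * sum_t x_it x_jt with x_it ~ N(0, 1)."""
    x = rng.standard_normal((n, T))
    return x @ x.T / T

def min_variance_weights(cov):
    """Eq. (1): w = cov^-1 1 / (1^T cov^-1 1)."""
    x = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return x / x.sum()

rng = np.random.default_rng(0)   # seed chosen arbitrarily
n, T = 100, 600                  # smaller than in the text, same ratio n/T = 1/6
w_noisy = min_variance_weights(noisy_covariance(n, T, rng))
scaled = n * w_noisy             # equals 1 for every asset without noise
print("mean of n*w:", scaled.mean())
print("std  of n*w:", scaled.std())
print("min, max   :", scaled.min(), scaled.max())
```

Because the weights sum to one, the mean of `scaled` is exactly 1 by construction; the spread around 1 is what measures the displacement of the solution.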
Fig. 1: (a) Histogram of $n w_i^*$ illustrating the displacement of the solution in the presence
of noise for N = 500 and T = 3000 (compare to $n w_i^{(0)*} = 1$). To achieve smoothness, the
histogram has been averaged over 10 samples for $\sigma_{ij}$. (b) Standard deviation of $n w_i^*$ for
N = 500 and different values for T.



We have studied the behavior of $n w_i^*$ for different system sizes N (the number of
assets, denoted n in the formulas above) and different time series lengths T. Without
noise this quantity is $n w_i^{(0)*} = 1$ for any i. In the presence of noise, however, $n w_i^*$
fluctuates around 1. The distribution of $n w_i^*$ for typical values of N and T is given in
Fig. 1(a). It can be seen from the figure that weights as low as 0 or as high as 2 appear
with non-negligible frequency, which suggests that the effect of noise is quite strong,
i.e. the optimal portfolio obtained using the noisy covariance matrix may be rather
different from the "true" optimal portfolio. The standard deviation of this distribution
as a function of T is given in Fig. 1(b). One can see that the deviation from the optimal
portfolio remains significant even for quite large T.
However, the effect of noise should be assessed not so much on the basis of the change
in composition, but rather of the shift in the volatility of the optimal portfolio, because
this is the only factor rational investors should actually care about in our simplified
optimization problem. Let us, therefore, compare the variance of the "true" (noiseless)
optimal portfolio, $(\sigma_p^{(0)})^2 = \sum_{i,j=1}^{n} w_i^{(0)*} \sigma^{(0)}_{ij} w_j^{(0)*} = \sum_{i=1}^{n} (w_i^{(0)*})^2$, with the "true" variance
of the optimal portfolio obtained in the presence of noise, $\sigma_p^2 = \sum_{i,j=1}^{n} w_i^* \sigma^{(0)}_{ij} w_j^* = \sum_{i=1}^{n} (w_i^*)^2$
(the second equality in each case holds because $\sigma^{(0)}_{ij}$ is the identity matrix). More
precisely, we have calculated the volatility ratio $q = \sigma_p / \sigma_p^{(0)}$, which measures
the increase in volatility (and therefore decrease in efficiency) of the optimal portfolio
due to noise. Fig. 2 shows the magnitude of this quantity as a function of T for given N.
It can be seen from the figure that for N = 500 and T > 3000 the increase in volatility 
due to noise is less than 10%. The decrease in efficiency is, in most cases, of the order
of 5-15% for time series lengths that seem reasonable from a practical point of view (2000 < T < 5000). It
seems, therefore, that despite its obvious effect on the weights of the optimal portfolio, 
noise has a significantly less pronounced impact on the risk at the optimum. 
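The volatility ratio q can be estimated with a short simulation. The sketch below assumes the identity matrix as the true covariance, as in the setup above, so that the true variance of any weight vector is the sum of its squared weights and q reduces to sqrt(n * sum_i w_i^2). The function name `volatility_ratio`, the seed, the portfolio size n = 100 and the number of samples are illustrative assumptions, not values prescribed by the paper.

```python
import numpy as np

def volatility_ratio(n, T, rng):
    """q = sigma_p / sigma_p^(0) for one noisy sample, with the identity as true covariance."""
    x = rng.standard_normal((n, T))
    cov = x @ x.T / T                       # noisy covariance, Eq. (2)
    w = np.linalg.solve(cov, np.ones(n))
    w /= w.sum()                            # optimal weights, Eq. (1)
    # True variances: sum_i w_i^2 for the noisy weights, 1/n for w_i = 1/n.
    return np.sqrt(n * np.sum(w ** 2))

rng = np.random.default_rng(1)              # arbitrary seed
n, samples = 100, 20                        # illustrative choices
q_by_T = {}
for T in (200, 500, 1000, 2000):
    q_by_T[T] = np.mean([volatility_ratio(n, T, rng) for _ in range(samples)])
    print(f"T = {T:5d}  q = {q_by_T[T]:.3f}")
```

Since the weights sum to one, the Cauchy-Schwarz inequality guarantees q >= 1 for every sample, so noise can only increase the true risk of the selected portfolio.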

Fig. 2: Ratio $q = \sigma_p / \sigma_p^{(0)}$ quantifying the decrease in efficiency of the optimal portfolio due to
noise in the covariance matrix as a function of T (N = 500).

It is interesting to estimate the magnitude of this effect for values of N and T
similar to those considered in [1, 2]. In [1], daily returns on N = 406 stocks of the
S&P 500 over the period 1991-1996 (a total of T = 1309 daily observations) have been 
used for constructing the correlation matrix. We have found that for this portfolio 
size and for this time series length the value of q is around 1.20, i.e. the risk of the 
optimal portfolio in the presence of noise is about 20% larger than without. As for [2],
in that paper 30-minute returns on the largest N = 1000 U.S. stocks over the two-year
period 1994-1995 (a total of T = 6448 data points for each series) have been used, for 
which q is about 1.09, i.e. the decrease in efficiency due to noise is only around 9%. 
According to these findings, the impact of noise on the portfolio optimization problem
does not seem to be as dramatic as one might have feared: despite the high
level of noise, empirical covariance matrices can still be used as input for a portfolio
optimization problem without losing too much accuracy.

Moreover, for smaller system sizes N we need shorter time series lengths T to
achieve the same degree of precision. For example, for N = 100 and T = 500 the
increase in volatility is 11%, while for T = 1000 it is only about 5%. Therefore, if 
one uses covariance matrices obtained e.g. from 4 years of daily returns (around 1000 
observations) say on the S&P 100 stocks, the loss in efficiency due to noise in the 
covariance matrix is only about 5%. 

In order to give a better representation of the actual structure of empirical correlation
matrices, we have repeated our experiments with matrices which, in addition to
the pure random part given by Eq. (2), have one clearly distinct eigenvalue chosen to
be about 25 times larger than the largest eigenvalue predicted by RMT (see [1, 2]),
with a corresponding eigenvector in the direction of (1, 1, . . . , 1). The displacement of
the risk associated with the optimal portfolio due to noise has been found to be of the 
same order of magnitude as in the cases discussed earlier. 

The explanation for the lack of a more dramatic effect on the portfolio in the 
presence of noise is actually very simple. A function $f(x)$ with a single well-defined
flat minimum at $x^*$ (e.g. a quadratic function) varies slowly in the neighbourhood of
the minimum; therefore, the value $f(x)$ need not be much higher than $f(x^*)$, even for
significant deviations of $x$ from the minimum. In our case, the volatility of the portfolio
is exactly such a function of the weights $w_i$, and therefore even if the weights deviate
significantly from the optimal ones (as they do), the risk of the portfolio will not be 
dramatically affected. Since rational investors should not care about the composition
of their portfolios but only about their risk, the effect of noisy covariance matrices on the
portfolio optimization problem will be less significant than expected. In other words,
it appears that despite noise, covariance matrices deduced from financial return series
can, in certain cases, be of reasonable practical use.
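The flat-minimum argument can be made explicit for the present problem. The following is a sketch added for illustration, not a derivation taken from the paper; the symbol $\delta_i = w_i^* - w_i^{(0)*}$ is introduced here for the displacement of the weights, and it satisfies $\sum_i \delta_i = 0$ because both sets of weights obey the budget constraint.

```latex
\begin{align*}
\sigma_p^2
  &= \sum_{i,j=1}^{n} \bigl(w_i^{(0)*} + \delta_i\bigr)\, \sigma^{(0)}_{ij}\, \bigl(w_j^{(0)*} + \delta_j\bigr) \\
  &= \bigl(\sigma_p^{(0)}\bigr)^2
   + 2 \sum_{i=1}^{n} \delta_i \sum_{j=1}^{n} \sigma^{(0)}_{ij} w_j^{(0)*}
   + \sum_{i,j=1}^{n} \delta_i\, \sigma^{(0)}_{ij}\, \delta_j .
\end{align*}
% By Eq. (1), \sum_j \sigma^{(0)}_{ij} w_j^{(0)*} is the same constant for every i,
% so the linear term is proportional to \sum_i \delta_i = 0 and drops out.
% For \sigma^{(0)}_{ij} the identity matrix and w_i^{(0)*} = 1/n this gives
\begin{equation*}
q^2 = \frac{\sigma_p^2}{\bigl(\sigma_p^{(0)}\bigr)^2}
    = 1 + n \sum_{i=1}^{n} \delta_i^2
    = 1 + \frac{1}{n} \sum_{i=1}^{n} \bigl(n w_i^* - 1\bigr)^2 .
\end{equation*}
```

In this setup $q^2 - 1$ is therefore the variance of $n w_i^*$ across assets. As a purely illustrative number, a standard deviation of 0.4 in $n w_i^*$ would correspond to $q = \sqrt{1.16} \approx 1.08$: a large spread in the weights translates into only a modest increase in risk.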

Let us note, however, that the picture becomes completely different if we consider 
an optimization problem with non-linear constraints like those that arise e.g. in the 
case of margin accounts or in international capital adequacy regulation 0, ^. In 
these cases the budget constraint has the form $\sum_{i=1}^{n} |w_i| = 1$. As shown in [0], this
problem maps exactly onto finding the ground states of a long-range spin glass, and 
now the presence of noise leads to a serious instability and a high degree of degeneracy 
of the solutions. These results have far-reaching economic implications, the analysis
of which is, however, far beyond the scope of this short note.
3 Conclusion



In this paper the impact of noisy covariance matrices on the portfolio optimization
problem has been investigated. Earlier studies [1, 2] have pointed out that a large part
of the spectrum of empirical covariance matrices deduced from financial return series 
corresponds to that of a purely random matrix. The presence of such a high level of 
noise in covariance data could have had devastating consequences for the reliability 
of different risk management practices based on the use of these matrices. We have 
analyzed the impact of this noise on the classical portfolio optimization problem and 
found that the risk of the optimal portfolio determined in the presence of noise is 
typically no more than 5-15% higher than in the absence of it, showing that the
decrease in efficiency of the optimal portfolio is actually much less dramatic than might have been feared. This
suggests that the important results of [1, 2] and the practical usefulness of covariance
matrices can, in fact, be reconciled. 